Enhanced Extraction from Huffman Encoded Files

Authors

  • Shmuel Tomi Klein
  • Dana Shapira
Abstract

Given a file T and the Huffman encoding of its elements, we suggest using a pruning technique for Wavelet trees that enables direct access to the i-th element of T by reordering the bits of the compressed file and using some additional space. Compared to a traditional Wavelet tree for Huffman codes, our different reordering of the bits usually requires less additional storage overhead by reducing the need for auxiliary rank structures, while improving processing time for extracting the i-th element of T.
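For orientation, the following is a minimal sketch of the baseline the abstract compares against: a traditional Huffman-shaped Wavelet tree with direct access to the i-th element. It is not the pruned variant proposed by the authors, whose details are not given here. The names `huffman_codes`, `Node`, `build` and `access` are hypothetical, the rank structure is a naive prefix-sum array rather than a succinct one, and the input is assumed to be non-empty.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return a dict mapping each symbol of `text` to a Huffman codeword (bit string)."""
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


class Node:
    """Wavelet tree node: stores one codeword bit per element routed through it."""

    def __init__(self, seq, codes, depth=0):
        self.symbol = None
        if len(codes[seq[0]]) == depth:  # codeword fully consumed: leaf
            self.symbol = seq[0]
            return
        self.bits = [int(codes[s][depth]) for s in seq]
        # prefix[k] = number of 1-bits in bits[:k] (naive rank structure)
        self.prefix = [0]
        for b in self.bits:
            self.prefix.append(self.prefix[-1] + b)
        parts = ([s for s in seq if codes[s][depth] == "0"],
                 [s for s in seq if codes[s][depth] == "1"])
        self.child = [Node(p, codes, depth + 1) if p else None for p in parts]


def build(text):
    """Build a Huffman-shaped Wavelet tree for a non-empty sequence."""
    return Node(list(text), huffman_codes(text))


def access(root, i):
    """Return the i-th element (0-based) by following rank results down the tree."""
    node = root
    while node.symbol is None:
        b = node.bits[i]
        ones_before = node.prefix[i]
        # New position = number of earlier elements that went to the same child.
        i = ones_before if b else i - ones_before
        node = node.child[b]
    return node.symbol


text = "abracadabra"
tree = build(text)
decoded = "".join(access(tree, i) for i in range(len(text)))
```

In this sketch every internal node needs its own rank data, which is the auxiliary overhead the abstract says the pruning technique aims to reduce.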

Similar resources

Parallel Huffman Decoding with Applications to JPEG Files

A simple parallel algorithm for decoding a Huffman encoded file is presented, exploiting the tendency of Huffman codes to resynchronize quickly, i.e. recovering after possible decoding errors, in most cases. The average number of bits that have to be processed until synchronization is analyzed and shows good agreement with empirical data. As Huffman coding is also a part of the JPEG image compr...

Better Huffman Coding via Genetic Algorithm

We present an approach to compress arbitrary files using a Huffman-like prefix-free code generated through the use of a genetic algorithm, thus requiring no prior knowledge of substring frequencies in the original file. This approach also enables multiple-character substrings to be encoded. We demonstrate, through testing on various different formats of real-world data, that in some domains, th...

Speeding Up String Pattern Matching by Text Compression: The Dawn of a New Era

This paper describes our recent studies on string pattern matching in compressed texts mainly from practical viewpoints. The aim is to speed up the string pattern matching task, in comparison with an ordinary search over the original texts. We have successfully developed (1) an AC type algorithm for searching in Huffman encoded files, and (2) a KMP type algorithm and (3) a BM type algorithm for...

A Lossless re-Encoding of MPEG-2 Coded file by Integrating Four Motion Vectors

Re-encoding of once compressed files is one of the difficult challenges in measuring the efficiency of coding methods. Variable length coding with a variable source delimiting scheme is a promising method for improving re-encoding efficiency. Analyses of coded files with fixed length delimiting and with variable length delimiting are reviewed. Motion vector codes of MPEG-2 encoded files are mod...

Identification and Recovery of JPEG Files with Missing Fragments

Recovery of fragmented files proves to be a challenging task for encoded files like JPEG. In this paper, we consider techniques for addressing two issues related to fragmented JPEG file recovery. First issue concerns more efficient identification of the next fragment of a file undergoing recovery. Second issue concerns the recovery of file fragments which cannot be linked to an existing image h...

Publication date: 2015